
    Integrated region- and pixel-based approach to background modelling

    In this paper a new probabilistic method for background modelling is proposed, aimed at video surveillance tasks that use a static monitoring camera. Methods employing Time-Adaptive, Per-Pixel, Mixture of Gaussians (TAPPMOG) modelling have recently become popular due to their intrinsically appealing properties. Nevertheless, they cannot by themselves monitor global changes in the scene, because they model the background as a set of independent pixel processes. In this paper, we propose to integrate this kind of pixel-based information with higher-level region-based information, which also makes it possible to handle sudden changes of the background. These pixel- and region-based modules are naturally and effectively embedded in a probabilistic Bayesian framework, particle filtering, which allows multi-object tracking. Experimental comparison with a classic pixel-based approach shows that the proposed method is effective in recovering from sudden global illumination changes of the background, as well as from limited non-uniform changes of the scene illumination.
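
    The pixel-level half of this pipeline can be made concrete with a small sketch. Below is a minimal, single-pixel version of a time-adaptive Mixture-of-Gaussians model in the spirit of Stauffer and Grimson's TAPPMOG; all parameter values (learning rate, match threshold, background fraction) are illustrative assumptions, and the paper's region-level module and particle-filter integration are not reproduced.

```python
# Minimal sketch: adaptive mixture of K Gaussians for ONE grey-level pixel.
# A real background model keeps one such mixture per pixel (vectorized).
import numpy as np

class PixelMoG:
    def __init__(self, k=3, alpha=0.01, match_sigmas=2.5, bg_fraction=0.7):
        self.w = np.full(k, 1.0 / k)          # component weights
        self.mu = np.linspace(0.0, 255.0, k)  # component means (grey levels)
        self.var = np.full(k, 225.0)          # component variances
        self.alpha = alpha                    # adaptation (learning) rate
        self.match_sigmas = match_sigmas      # match threshold, in std devs
        self.bg_fraction = bg_fraction        # weight mass treated as background

    def update(self, x):
        """Absorb a new grey value x; return True if x is classified background."""
        matched = (x - self.mu) ** 2 < (self.match_sigmas ** 2) * self.var
        if matched.any():
            k = int(np.argmax(matched * self.w / np.sqrt(self.var)))
            rho = self.alpha                  # simplification: rho == alpha
            self.w = (1.0 - self.alpha) * self.w
            self.w[k] += self.alpha
            self.mu[k] += rho * (x - self.mu[k])
            self.var[k] += rho * ((x - self.mu[k]) ** 2 - self.var[k])
        else:                                 # no match: replace weakest component
            k = int(np.argmin(self.w))
            self.mu[k], self.var[k], self.w[k] = x, 900.0, 0.05
        self.w /= self.w.sum()
        # components carrying the first bg_fraction of weight (sorted by
        # weight/sigma) are deemed background
        order = np.argsort(-self.w / np.sqrt(self.var))
        n_bg = int(np.searchsorted(np.cumsum(self.w[order]), self.bg_fraction)) + 1
        return matched.any() and k in order[:n_bg]

# toy usage: a stable pixel around 100, briefly occluded by a bright object
pix = PixelMoG()
for value in [100, 101, 99, 100, 250, 101]:
    print(value, pix.update(value))
```

    Because each pixel evolves independently, a global illumination switch invalidates every mixture at once, which is exactly the failure mode the paper's region-based module is designed to catch.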

    ACOUSTIC RANGE IMAGE SEGMENTATION BY EFFECTIVE MEAN SHIFT

    Image perception in an underwater environment is a difficult task for a human operator, and data segmentation becomes a crucial step towards higher-level interpretation and recognition of the observed scenarios. This paper contributes to the related state of the art by fitting the mean shift clustering paradigm to the segmentation of acoustic range images, providing a segmentation approach that requires no parameter tuning. Moreover, the method actively exploits the connectivity information provided by the range map, using reverse projection as an acceleration technique. The method is therefore able to produce, starting from raw range data, meaningful segmented clouds of points in a fully automatic and efficient fashion.
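
    To illustrate the base paradigm being fitted here, a minimal flat-kernel mean shift on a 3-D point cloud is sketched below. Note this sketch uses a fixed, hand-chosen bandwidth, whereas the paper's contribution is precisely a tuning-free variant accelerated through the range map's connectivity (reverse projection), which is not reproduced.

```python
# Minimal flat-kernel mean shift on a cloud of 3-D range points.
import numpy as np

def mean_shift(points, bandwidth=0.5, n_iter=30, merge_tol=1e-2):
    """Shift every point to the mean of its bandwidth-neighbourhood; points
    that converge to the same mode belong to the same segment."""
    modes = points.copy()
    for _ in range(n_iter):
        for i, m in enumerate(modes):
            neigh = points[np.linalg.norm(points - m, axis=1) < bandwidth]
            if len(neigh):
                modes[i] = neigh.mean(axis=0)
    # merge modes that ended up closer than merge_tol into one cluster label
    labels, centers = -np.ones(len(points), dtype=int), []
    for i, m in enumerate(modes):
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) < merge_tol:
                labels[i] = j
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# toy usage: two well-separated patches from a synthetic range map
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.1, (50, 3)), rng.normal(3, 0.1, (50, 3))])
labels, centers = mean_shift(cloud)
print(len(centers), "segments")  # expected: 2
```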

    Scalable and Compact 3D Action Recognition with Approximated RBF Kernel Machines

    Despite the recent deep learning (DL) revolution, kernel machines remain powerful methods for action recognition. DL has brought the use of large datasets, and this is typically a problem for kernel approaches, which do not scale up efficiently due to kernel Gram matrices. Nevertheless, kernel methods are still attractive and more generally applicable, since they can equally manage datasets of different sizes, including cases where DL techniques show limitations. This work investigates these issues by proposing an explicit approximated representation that, together with a linear model, is an equivalent, yet scalable, implementation of a kernel machine. Our approximation is directly inspired by the exact feature map induced by an RBF Gaussian kernel but, unlike the latter, it is finite-dimensional and very compact. We justify the soundness of our idea with a theoretical analysis which proves the unbiasedness of the approximation and provides a vanishing bound for its variance, which is shown to decrease much more rapidly than in alternative methods in the literature. In a broad experimental validation, we assess the superiority of our approximation in terms of 1) ease and speed of training, 2) compactness of the model, and 3) improvements with respect to state-of-the-art performance.
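
    The general recipe described here, replacing an RBF kernel machine with an explicit finite-dimensional feature map plus a linear model, can be sketched with the classic random Fourier features of Rahimi and Recht; the paper proposes its own, more compact and lower-variance map, so this is a stand-in for the idea rather than the paper's estimator.

```python
# Random Fourier features: z(x).z(y) ~= exp(-gamma * ||x - y||^2)
import numpy as np

def rff_map(X, n_features=512, gamma=1.0, seed=0):
    """Explicit finite-dimensional approximation of the RBF feature map."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# sanity check: approximated kernel vs exact kernel on random data
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 10))
Z = rff_map(X, n_features=4096, gamma=0.1)
K_approx = Z @ Z.T
K_exact = np.exp(-0.1 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.abs(K_approx - K_exact).max())  # small; shrinks as n_features grows
```

    A linear classifier trained on Z then behaves like the corresponding RBF kernel machine, with training cost linear in the number of samples instead of quadratic in the Gram matrix.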

    Enhancing visual embeddings through weakly supervised captioning for zero-shot learning

    Visual features designed for image classification have been shown to be useful in zero-shot learning (ZSL) when generalizing towards classes not seen during training. In this paper, we argue that a more effective way of building visual features for ZSL is to extract them through captioning, so as not merely to classify an image but, instead, to describe it. However, modern captioning models rely on a massive level of supervision, e.g. up to 15 extended human-provided descriptions per instance, which is simply not available for ZSL benchmarks. In the latter, in fact, the available annotations only indicate the presence or absence of attributes from a fixed list. Worse, attributes are seldom annotated at the image level, but rather at the class level only; because of this, the annotation cannot be visually grounded. In this paper, we deal with such a weakly supervised regime to train an end-to-end LSTM captioner, whose backbone CNN image encoder can provide better features for ZSL. Our enhancement of visual features, called 'VisEn', is compatible with any generic ZSL method, without requiring changes in its pipeline (apart from adapting hyper-parameters). Experimentally, VisEn sharply improves recognition performance on unseen classes, as we demonstrate through an ablation study encompassing different ZSL approaches. Further, on the challenging fine-grained CUB dataset, VisEn improves on state-of-the-art methods by a margin, while using visual descriptors one order of magnitude smaller.
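
    A rough sketch of the captioner structure the abstract describes follows: a CNN encoder feeds an LSTM trained to emit the class-level attribute words as a pseudo-caption, and the encoder's features are then reused downstream. Module names and sizes are illustrative assumptions; the paper's exact architecture and losses are not reproduced.

```python
# Hypothetical weakly supervised captioner: CNN encoder + LSTM decoder over
# attribute-word tokens (the only annotation available in ZSL benchmarks).
import torch
import torch.nn as nn
from torchvision.models import resnet18

class WeakCaptioner(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        cnn = resnet18(weights=None)
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # 512-d pooled
        self.img_proj = nn.Linear(512, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def features(self, images):
        """ZSL-ready visual embedding (what gets reused downstream)."""
        return self.encoder(images).flatten(1)

    def forward(self, images, attr_tokens):
        # prepend the projected image as the first "token", then teacher-force
        img = self.img_proj(self.features(images)).unsqueeze(1)
        seq = torch.cat([img, self.embed(attr_tokens)], dim=1)
        out, _ = self.lstm(seq)
        return self.head(out[:, :-1])  # logits for each attribute token

# training: cross-entropy between forward(...) and attr_tokens; afterwards,
# model.features(...) provides the embeddings for any generic ZSL pipeline.
```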

    A Unifying Framework in Vector-valued Reproducing Kernel Hilbert Spaces for Manifold Regularization and Co-Regularized Multi-view Learning

    This paper presents a general vector-valued reproducing kernel Hilbert space (RKHS) framework for the problem of learning an unknown functional dependency between a structured input space and a structured output space. Our formulation encompasses both Vector-valued Manifold Regularization and Co-regularized Multi-view Learning, providing in particular a unifying framework linking these two important learning approaches. In the case of the least squares loss function, we provide a closed-form solution, obtained by solving a system of linear equations. In the case of Support Vector Machine (SVM) classification, our formulation generalizes both the binary Laplacian SVM to the multi-class, multi-view setting and the multi-class Simplex Cone SVM to the semi-supervised, multi-view setting. The solution is obtained by solving a single quadratic optimization problem, as in standard SVM, via the Sequential Minimal Optimization (SMO) approach. Empirical results obtained on the task of object recognition, using several challenging datasets, demonstrate the competitiveness of our algorithms compared with other state-of-the-art methods.
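
    As a worked special case of the closed form mentioned for the least squares loss, the sketch below solves scalar-output Laplacian-regularized least squares; the vector-valued, multi-view formulation in the paper reduces to a structurally similar linear system. The regularization weights lam_a and lam_i are illustrative assumptions.

```python
# Scalar Laplacian RLS: solve (J K + lam_a I + lam_i L K) alpha = J y
# for the coefficients of f(x) = sum_i alpha_i k(x, x_i).
import numpy as np

def laplacian_rls(K, L, y, labeled, lam_a=1e-2, lam_i=1e-2):
    n = K.shape[0]
    J = np.zeros((n, n))
    J[labeled, labeled] = 1.0                      # selects the labeled points
    A = J @ K + lam_a * np.eye(n) + lam_i * L @ K  # the linear system's matrix
    return np.linalg.solve(A, J @ y)

# toy usage: RBF kernel + crude kNN graph Laplacian over 2-D points
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
D2 = np.sum((X[:, None] - X[None]) ** 2, axis=-1)
K = np.exp(-D2)
W = (D2 < np.sort(D2, axis=1)[:, [5]]) * 1.0       # 5-NN adjacency
W = np.maximum(W, W.T)
L = np.diag(W.sum(1)) - W                          # unnormalized graph Laplacian
y = np.sign(X[:, 0])                               # labels; only 4 revealed
alpha = laplacian_rls(K, L, y, labeled=np.arange(4))
pred = np.sign(K @ alpha)                          # semi-supervised predictions
```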

    Stel component analysis: Modeling spatial correlations in image class structure


    Adaptation of Person Re-identification Models for On-boarding New Camera(s)

    Existing approaches for person re-identification have concentrated on either designing the best feature representation or learning optimal matching metrics in a static setting where the number of cameras in a network is fixed. Most approaches have neglected the dynamic and open-world nature of the re-identification problem, where one or multiple new cameras may be temporarily on-boarded into an existing system to gather additional information, or added to expand an existing network. To address this very practical problem, we propose a novel approach for adapting existing multi-camera re-identification frameworks with limited supervision. First, we formulate a domain-perceptive re-identification method based on the geodesic flow kernel that can effectively find the best source camera (already installed) to adapt to the newly introduced target camera(s), without requiring a very expensive training phase. Second, we introduce a transitive inference algorithm for re-identification that can exploit the information from the best source camera to improve the accuracy across other camera pairs in a network of multiple cameras. Third, we develop a target-aware sparse prototype selection strategy for finding an informative subset of source camera data for data-efficient learning in resource-constrained environments. Our approach can greatly increase the flexibility and reduce the deployment cost of new cameras in many real-world dynamic camera networks. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art unsupervised alternatives whilst being extremely efficient to compute.
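
    The "find the best source camera" step can be sketched as follows. The paper scores source cameras with a geodesic flow kernel between feature subspaces; as a simplified, assumption-laden proxy, this sketch instead ranks sources by the principal angles between PCA subspaces of source and target camera features.

```python
# Proxy for GFK-based source selection: rank installed cameras by how well
# their feature subspace aligns with the new (target) camera's subspace.
import numpy as np

def pca_subspace(feats, dim=10):
    """Orthonormal basis of the top principal directions of the features."""
    X = feats - feats.mean(0)
    U, _, _ = np.linalg.svd(X.T @ X)
    return U[:, :dim]

def subspace_similarity(P_src, P_tgt):
    """Mean cosine of principal angles; 1.0 means identical subspaces."""
    s = np.linalg.svd(P_src.T @ P_tgt, compute_uv=False)
    return float(s.mean())

def best_source(source_feats, target_feats, dim=10):
    """source_feats: list of (n_i, d) arrays, one per installed camera."""
    P_t = pca_subspace(target_feats, dim)
    scores = [subspace_similarity(pca_subspace(F, dim), P_t)
              for F in source_feats]
    return int(np.argmax(scores)), scores
```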

    Intra-Camera Supervised Person Re-Identification: A New Benchmark

    Existing person re-identification (re-id) methods rely mostly on a large set of inter-camera identity-labelled training data, requiring a tedious data collection and annotation process and therefore leading to poor scalability in practical re-id applications. To overcome this fundamental limitation, we consider person re-identification without inter-camera identity association, but only with identity labels independently annotated within each individual camera view. This eliminates the most time-consuming and tedious inter-camera identity labelling process, significantly reducing the amount of human effort required during annotation. It hence gives rise to a more scalable and more feasible learning scenario, which we call Intra-Camera Supervised (ICS) person re-id. Under this ICS setting with weaker label supervision, we formulate a Multi-Task Multi-Label (MTML) deep learning method. Given no inter-camera association, MTML is specially designed to self-discover the inter-camera identity correspondence. This is achieved by inter-camera multi-label learning under a joint multi-task inference framework. In addition, MTML can also efficiently learn discriminative re-id feature representations by fully using the available identity labels within each camera view. Extensive experiments demonstrate the performance superiority of our MTML model over state-of-the-art alternative methods on three large-scale person re-id datasets in the proposed intra-camera supervised learning setting.

    Comment: 9 pages, 3 figures; accepted by the ICCV Workshop on Real-World Recognition from Low-Quality Images and Videos, 2019.
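
    The multi-task structure described here can be sketched as a shared backbone with one identity-classification head per camera, each trained only on its own camera's intra-camera labels. The paper's cross-camera multi-label association step is not reproduced, and all sizes and names below are illustrative assumptions.

```python
# Per-camera multi-task heads over a shared feature backbone.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class IntraCameraMTML(nn.Module):
    def __init__(self, ids_per_camera):
        super().__init__()
        cnn = resnet18(weights=None)
        self.backbone = nn.Sequential(*list(cnn.children())[:-1])  # shared
        # one classifier per camera view: one "task" per camera
        self.heads = nn.ModuleList(nn.Linear(512, n) for n in ids_per_camera)

    def forward(self, images, camera_id):
        f = self.backbone(images).flatten(1)
        return self.heads[camera_id](f)  # logits over that camera's identities

# training: for a batch drawn from camera c with intra-camera labels y,
# minimize cross_entropy(model(images, c), y); at test time, the 512-d
# backbone features are used for cross-camera matching.
```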